A clustering ensemble: Two-level-refined co-association matrix with path-based transformation
Abstract
The aim of a clustering ensemble is to combine multiple base partitions into a robust, stable and accurate partition. One of the key problems in clustering ensembles is how to exploit the cluster structure information in each base partition. Evidence accumulation is an effective framework that converts the base partitions into a co-association matrix. This matrix records how frequently each pair of points is assigned to the same cluster, but it ignores some information hidden in the base partitions. In this paper, we reveal some of that information by refining the co-association matrix at two levels: the data point level and the base cluster level. At the data point level, pairs of points in the same base cluster may have different similarities, so their contributions to the co-association matrix can differ. At the cluster level, base clusters may differ in quality, so the contribution of a base cluster as a whole can also differ from that of the others. After refinement, the co-association matrix is transformed into a path-based similarity matrix so that more global information about the cluster structure is incorporated into the matrix. Finally, spectral clustering is applied to the matrix to generate the final clustering result. Experimental results on eight synthetic and eight real data sets demonstrate that the clustering ensemble based on the refined co-association matrix outperforms some state-of-the-art clustering ensemble schemes. © 2015 Elsevier Ltd. All rights reserved.
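The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two-level refinement is omitted because the abstract does not give its weighting formulas, and the path-based transformation is assumed to be the common max-min definition (the similarity of two points is the largest value, over all connecting paths, of the weakest link on the path). The function names `co_association` and `path_based` and the toy partitions are hypothetical.

```python
def co_association(partitions):
    """Plain (unrefined) co-association matrix: the fraction of base
    partitions in which each pair of points shares a cluster."""
    n = len(partitions[0])
    m = len(partitions)
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / m for c in row] for row in counts]


def path_based(sim):
    """Assumed max-min path-based similarity, computed with a
    Floyd-Warshall-style update: a path through k is only as strong
    as its weakest edge, and the strongest such path wins."""
    n = len(sim)
    s = [row[:] for row in sim]  # copy so the input stays unchanged
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = min(s[i][k], s[k][j])
                if via_k > s[i][j]:
                    s[i][j] = via_k
    return s


# Three hypothetical base partitions of five points (cluster labels).
partitions = [
    [0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 1, 1, 2, 2],
]
ca = co_association(partitions)
pb = path_based(ca)
```

In this toy example, points 0 and 2 share a cluster in only one of the three partitions, but both are frequently grouped with point 1, so the path-based matrix raises their similarity to the weakest link on that path. For the final step, one option (assuming scikit-learn is available) is to pass the transformed matrix to `sklearn.cluster.SpectralClustering` with `affinity="precomputed"`.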
Similar resources
The ensemble clustering with maximize diversity using evolutionary optimization algorithms
Data clustering is one of the main steps in data mining, which is responsible for exploring hidden patterns in non-tagged data. Due to the complexity of the problem and the weakness of the basic clustering methods, most studies today are guided by clustering ensemble methods. Diversity in primary results is one of the most important factors that can affect the quality of the final results. Also...
Selecting primary clusters with intelligent algorithms for participation in ensemble clustering
Most of the recent studies have tried to create diversity in primary results and then applied a consensus function over all the obtained results to combine the weak partitions. In this paper a clustering ensemble method is proposed which is based on a subset of primary clusters. The main idea behind this method is using more stable clusters in the ensemble. The stability is applied as a goodnes...
Ensemble clustering based on a subset of primary clusters
Most of the recent studies have tried to create diversity in primary results and then applied a consensus function over all the obtained results to combine the weak partitions. In this paper a clustering ensemble method is proposed which is based on a subset of primary clusters. The main idea behind this method is using more stable clusters in the ensemble. The stability is applied as a goodnes...
Weighted Ensemble Clustering for Increasing the Accuracy of the Final Clustering
Clustering algorithms are highly dependent on different factors such as the number of clusters, the specific clustering algorithm, and the used distance measure. Inspired from ensemble classification, one approach to reduce the effect of these factors on the final clustering is ensemble clustering. Since weighting the base classifiers has been a successful idea in ensemble classification, in th...
LCE: a link-based cluster ensemble method for improved gene expression data analysis
MOTIVATION It is far from trivial to select the most effective clustering method and its parameterization, for a particular set of gene expression data, because there are a very large number of possibilities. Although many researchers still prefer to use hierarchical clustering in one form or another, this is often sub-optimal. Cluster ensemble research solves this problem by automatically comb...
Journal:
- Pattern Recognition
Volume 48, issue
Pages -
Publication date 2015